Journal of Proteome Research — Latest Matching Preprints

1

Global Proteomics Investigation of SAMT-247 Targets: An Antiviral Thioester that Acetylates Zinc Finger Proteins

Jewell, C. P.; Perciaccante, A. J.; Brown, K.; Maity, T. K.; Dinan, J. C.; Bissa, M.; Rahman, M. A.; Franchini, G.; Appella, D. H.; Jenkins, L. M.

2026-04-30 cell biology 10.64898/2026.04.28.721345 medRxiv

Top 0.1%

56.4%

Covalent modification of target proteins is a well-established mechanism of action for small molecule inhibitors. Cysteine residues in particular have been exploited for their reactivity toward electrophilic molecules. SAMT-247 is a mercaptobenzamide thioester that covalently acetylates cysteines in the zinc-coordinating domains of the HIV nucleocapsid protein. This SAMT-247-promoted reaction leads to loss of zinc binding by the protein, with concomitant loss of protein structure and function. Although it has low cytotoxicity in animal models, recent studies have indicated that it affects other protein targets in uninfected cells, for example leading to increased immune cell functions. In this study, global proteomics approaches have been used to better understand other protein targets of SAMT-247. Minimal effects are observed when unstimulated THP-1 monocyte cells were treated with SAMT-247. In contrast, thermal proteome profiling identified 170 proteins with altered thermal stability when THP-1 cells were stimulated with phorbol 12-myristate 13-acetate/Ionomycin (PMA/Iono) before SAMT-247 treatment. Among the affected proteins, 81 contain a zinc-coordinating domain and/or have been shown to have a reactive cysteine residue. Among these, several play a role in cellular metabolism, and Seahorse assays demonstrated that SAMT-247 significantly increased the anti-metabolic and pro-glycolytic effect of PMA/Iono in THP-1 cells. Two of the most-affected proteins were ZC3H7A, a microRNA-binding protein with four zinc finger domains, and MGMT, a DNA damage repair protein with a reactive cysteine. Both proteins were modified by SAMT-247 when tested alone or in the presence of THP-1 cell lysate, indicating that they are bona fide targets of the inhibitor. The low activity of SAMT-247 in unstimulated THP-1 cells is consistent with its low cytotoxicity. 
The increased effects of SAMT-247 in stimulated immune cells suggest that this molecule could be developed to target diseases other than HIV.

2

Digging deeper into the immunopeptidome with TripleToolWF

Mayer, R. L.; Mechtler, K.

2026-06-14 biochemistry 10.64898/2026.06.11.731513 medRxiv

Top 0.1%

52.9%

While the field of immunopeptidomics has matured substantially over the last years, high input amounts of cellular or tissue material are still required to obtain a somewhat complete profile of the immunopeptidome. Here we present a simple platform termed TripleToolWF (derived from Triple Tool workflow) to increase the number of identified and quantified immunopeptides combining the outputs of three search engines such as PEAKS Online 12, Sequest HT with INFERYS rescoring and MSFragger. For assessing the false discovery rate (FDR) an entrapment approach is used. The platform improved peptide identifications by 6-14% and peptide quantitations by 11-25% compared to the best individual search engine for two independent, previously published, bacterial infection datasets. Peptides were mostly 9-12mers as expected and >90% of the obtained 9mers were predicted binders by the stringent majority voting approach of Immunolyser 2.0 which indicates high confidence of the identified immunopeptides. The FDR was monitored using dedicated entrapment searches against shuffled databases. The resulting entrapment FDR was assessed before and after result pooling and showed only a minor increase upon pooling compared to the worst individual search engine. It remained even below the target of 1% peptide FDR in 40% of the experiments. Compared to the original publications, the number of high confidence bacterial immunopeptides was drastically elevated by 53% and 2800% for the Listeria monocytogenes and Mycobacterium bovis BCG projects, respectively, when applying strict filters. Of these additional bacterial sequences, all 9mer sequences were predicted as binders by at least one of the prediction algorithms of Immunolyser 2.0 illustrating their actual HLA binding nature. 
TripleToolWF hence provides a simple tool to further increase the number of obtained sequences from MS-based immunopeptidomics experiments to facilitate a deeper view of the immunopeptidome for refined vaccine candidate prioritization.
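The pooling and entrapment check described in this abstract can be sketched as follows. This is a minimal illustration, not the TripleToolWF implementation: the function names, the `ENTRAP_` prefix used to flag entrapment hits, and the choice of estimator (entrapment hits scaled by 1 + 1/r, divided by all hits, where r is the assumed entrapment-to-target database size ratio) are all assumptions, since the abstract does not say which estimator was used.

```python
def pool_identifications(*engine_results):
    """Return the union of peptide sequences reported by any search engine."""
    pooled = set()
    for peptides in engine_results:
        pooled |= set(peptides)
    return pooled


def entrapment_fdr(peptides, is_entrapment, r=1.0):
    """Estimate the false discovery proportion from entrapment hits.

    Assumed estimator: n_entrapment * (1 + 1/r) / (n_target + n_entrapment),
    where r is the entrapment-to-target database size ratio.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    peptides = list(peptides)
    n_entrapment = sum(1 for p in peptides if is_entrapment(p))
    total = len(peptides)
    if total == 0:
        return 0.0
    return n_entrapment * (1.0 + 1.0 / r) / total


# Hypothetical toy results from three engines; "ENTRAP_" marks entrapment hits.
engine_a = {"PEPTIDEA", "PEPTIDEB", "ENTRAP_1"}
engine_b = {"PEPTIDEB", "PEPTIDEC"}
engine_c = {"PEPTIDED", "ENTRAP_1"}

pooled = pool_identifications(engine_a, engine_b, engine_c)
pooled_fdr = entrapment_fdr(pooled, lambda p: p.startswith("ENTRAP_"))
```

Running the same estimator on each engine's set and on the pooled set gives the before-and-after comparison the abstract refers to. Pooling by union can only add entrapment hits, which is why the estimate is worth monitoring after pooling.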

3

Trypsin exhibits exopeptidase-like activity toward N-terminal arginine that biases proteomic analyses

Ambrose, E. A.; Kandasamy, G.; Meulener, M. M.; Zhang, F.

2026-05-16 biochemistry 10.64898/2026.05.15.725550 medRxiv

Top 0.1%

52.9%

Many proteomics protocols rely on enzymatic digestion of complex protein mixtures to generate peptides with predictable cleavage patterns for the mass spectrometry analysis. One of the most utilized enzymes, trypsin, is classically defined as a serine endopeptidase with high specificity for cleaving peptide bonds on the C-terminal side of internal lysine and arginine residues. Accordingly, trypsin is not expected to remove the N-terminal arginine, which may arise through posttranslational modification such as arginylation or by proteolysis exposing internal residues as the new N-termini. N-terminal arginine plays important biological roles, including functioning as an N-degron and modulating protein interactions/signaling through its positive charge. Curiously, prior mass spectrometry-based studies utilizing trypsin to identify proteins bearing N-terminal arginine have frequently reported low and inconsistent yields, suggesting potential systematic bias in current proteomic approaches. Here, we explored whether trypsin would affect the integrity of the N-terminal arginine. By using antibodies specifically recognizing N-terminal arginine of different peptides, and by using mass spectrometry peptide analysis, we show that trypsin can remove N-terminal arginine residues in an exopeptidase-like manner. This effect occurs across a range of digestion conditions consistent with standard proteomic workflows, on peptides or whole proteins, and depends on trypsin concentration, incubation time, and catalytic activity. In addition, we show that the alternative arginine-cleavage enzyme Arg-C can also affect N-terminal arginine in a sequence-dependent context. In contrast, Lys-C and LysargiNase do not exhibit such effects, providing suitable alternative digestion strategies. 
Together, these findings reveal an unappreciated enzymatic behavior of arginine-cleaving proteases and suggest that their widespread use may systematically compromise the detection of N-terminal arginine in proteomic studies.

4

Hidden Structural Bias in Proteomics: Sonication-induced Selective Fragmentation of Intrinsically Disordered Regions

Narita, M.; Yamakawa, T.; Nishimura, R.; Iwasaki, M.

2026-07-15 cell biology 10.64898/2026.07.14.738389 medRxiv

Top 0.1%

52.6%

Sonication is a fundamental technique in proteome sample preparation, primarily used for protein solubilization and shearing of genomic DNA. Although the mechanical shearing of DNA is well-characterized, its unintended impact on protein structural integrity remains a significant "blind spot" in high-throughput analytical workflows. In this study, we systematically investigated sonication-induced protein fragmentation by combining gel-based fractionation (PEPPI-MS) with sequence-level compositional analysis and bioinformatic mapping. Our results demonstrate that sonication does not significantly alter overall proteome identification or the recovery of membrane proteins; however, it induces extensive and non-random protein fragmentation. Sonication caused an approximately three-fold increase in the abundance of >45 kDa protein-derived fragments migrating into the <40 kDa fraction, and 1,620 high-molecular-weight (MW) proteins were uniquely detected in the lower-MW fraction upon sonication, an eight-fold increase over non-sonicated controls. Peptide-level amino acid composition analysis revealed subtle but directional shifts in the sonication-derived fragments. This residue-level signature is reinforced by two orthogonal structural analyses (MobiDB peptide-level mapping and protein-level profiling using metapredict V3 software), which show that sonication-susceptible proteins harbor more than twice the disordered content of length-matched controls (median 40% vs. 18%). This study identifies a previously unrecognized "structural bias" whereby intrinsically disordered region (IDR)-rich proteins are selectively compromised during sample preparation. 
Because these fragments are indistinguishable from enzymatic digestion products in conventional bottom-up proteomics, the underlying structural damage is effectively masked in global quantitative datasets, potentially distorting biological interpretations related to protein size, isoforms, and stability, particularly for IDR-rich classes, such as transcription factors and signaling molecules. We propose that optimizing and standardizing sonication parameters is essential for ensuring the accuracy and reproducibility of quantitative proteomic analyses.

5

Systematic optimization and benchmarking of synchro-PASEF for high-throughput phosphoproteome profiling

Brademan, D.; Mullarkey, A.; Greeson, M.; Szvetecz, S.; Vitek, O.; Blythe, E.; Huttenhain, R.

2026-06-27 biochemistry 10.64898/2026.06.26.734570 medRxiv

Top 0.1%

52.5%

High-throughput data-independent acquisition (DIA) workflows paired with short chromatographic separations are increasingly adopted for systems biology and clinical proteomics. However, narrower peak widths from rapid separations demand faster mass spectrometer cycle times to maintain quantitative depth and reproducibility. The synchro-PASEF acquisition mode on timsTOF mass spectrometers diagonally scans across ion mobility and m/z space, enabling efficient sampling of the precursor ion cloud with shortened cycle times. While synchro-PASEF has demonstrated competitive identification depth for global protein abundance samples compared to conventional dia-PASEF, its performance for phosphoproteomics - where the precursor ion cloud is characteristically broader and bimodally distributed - has not been evaluated. Here, we systematically optimized synchro-PASEF methods for phosphoproteomics and benchmarked performance against two dia-PASEF methods across three sub-hour separations. We found that synchro-PASEF performance depends critically on balancing diagonal window number, total isolation width, and gradient length, with longer gradients favoring more windows for selectivity and shorter gradients favoring fewer windows to preserve sampling frequency. An optimized configuration quantified over 19,000 localized phosphosites using a 23-minute separation. Retention time summation (RTsum) with a factor of 2 increased phosphopeptide identifications by 5-20% and reduced phosphosite-level coefficients of variation by up to 30% across all dia-PASEF and synchro-PASEF methods tested. 
Using β2-adrenergic receptor (B2AR) activation as a signaling model, we demonstrate that label-free DIA phosphoproteomics can be used to model phosphoproteomics dose-response relationships, showing that synchro-PASEF and dia-PASEF produce highly concordant phosphoproteomic responses, with comparable numbers of responding phosphosites, similar effect sizes, and nearly identical predicted protein kinase A (PKA) substrates downstream of the activated B2AR. While synchro-PASEF did not surpass optimized dia-PASEF in identification depth, its comparable biological performance and amenability to post-acquisition optimization through RTsum support its utility for high-throughput phosphoproteomics. This work provides a transferable framework for synchro-PASEF method optimization and demonstrates the broad utility of retention time summation for PASEF-based phosphoproteomics workflows.

6

MassSpectrum Analyzer: An interactive platform for proteomic searching parameter refinement and peptide modification focused re-scoring

Karlic, K. I.; Scott, N. E.

2026-06-28 bioinformatics 10.64898/2026.06.22.733873 medRxiv

Top 0.1%

45.2%

Peptide spectrum annotation is critical for the assignment of peptides and the localisation of modifications. While many existing tools provide spectrum annotation capacities, they often lack the flexibility required to allow bespoke spectral annotation of peptides containing multiple labile modifications or the accurate assignment of peptides in which fragmentation deviates from canonical patterns. In these cases, user-guided annotation is widely used to improve assignment completeness, however it typically does not integrate peptide scoring, making it challenging to assess the empirical improvement of the associated annotation and its impact on downstream false-discovery rate estimations. Here, we introduce an interactive annotation environment, the 'MassSpectrum Analyzer', which aims to streamline the exploration and analysis of modified peptides by enabling user-defined customisation with peptide scoring. Using (2-Aminoethyl)trimethylammonium carboxyl-derivatised peptides and glycopeptides as case studies we demonstrate the capacity of the MassSpectrum Analyzer to rapidly explore and allow the assessment of modified peptide datasets. By enabling direct assessment of the impact of user-guided choices on peptide scoring, we show how the detection of highly modified peptides can be improved through post-search integration of modification fragmentation information in a statistically robust manner. Similarly, by permitting comparisons of peptide ion intensities across spectra, we show that global fragmentation patterns can be quantified allowing the interrogation of trends that only become clear when spectra are assessed en masse. Combined, the MassSpectrum Analyzer streamlines the generation of publication-ready spectra and provides a means to assess how the inclusion of annotated features influences assignment scores.

7

onsite: An Integrated Framework for Phosphosite Localization and False Localization Rate Estimation

Yue, Q.-X.; Wei, Z.; Dai, C.; Bai, M.; Perez-Riverol, Y.; Sachsenberg, T.

2026-07-11 bioinformatics 10.64898/2026.07.08.737157 medRxiv

Top 0.1%

45.2%

With the rapid development of mass spectrometry-based proteomics, the volume of phosphoproteomic data has increased substantially. However, accurate localization of phosphorylation sites and standardized statistical validation remain critical analytical bottlenecks. To address the lack of standardized cross-algorithm evaluation, we introduce onsite, a unified and open-source Python framework. onsite integrates an alanine-decoy strategy to estimate the false localization rate (FLR) across three algorithms: AScore, PhosphoRS, and pyLucXor. This modular architecture efficiently processes large-scale datasets and enables global FLR calculation. Benchmarking on the standard synthetic phosphopeptide dataset PXD000138 highlighted distinct inter-algorithmic variations. Using the same 5% global FLR threshold, pyLucXor localized the most target sites (28,353). It also reached a high accuracy (91.22%) against the known ground truth, resulting in the largest number of correctly localized sites (25,865). Reanalysis of the highly fractionated, large-scale PXD012255 dataset further demonstrated that native integration of onsite into the quantms pipeline enables scalable processing and provides a standardized framework for FLR control in large-scale phosphoproteomics.

8

Quantitative profiling of JMJD6-catalysed lysine hydroxylation reveals residue-dependent oxygen sensitivity

Kesavan, P.; Stuermer, S. M.; Räbel, K.; Popp, O.; Mertins, P.; Cockman, M. E.; Sugimoto, Y.

2026-06-08 biochemistry 10.64898/2026.06.08.730680 medRxiv

Top 0.1%

42.4%

Lysine hydroxylation is increasingly recognised as a widespread post-translational modification in human cells, with more than 100 modified sites identified to date. Jumonji domain-containing protein 6 (JMJD6) catalyses hydroxylation at multiple lysines within lysine-rich regions and is a major contributor to this modification across the proteome. As JMJD6 requires oxygen as a co-substrate, lysine hydroxylation has been proposed to couple oxygen availability to cellular function. However, the biological significance of this modification remains incompletely understood, in part due to technical challenges associated with the detection of hydroxylation within lysine-rich regions by mass spectrometry. To address limitations of conventional approaches, we systematically evaluated key steps in the analysis of proteomic data from lysine-derivatised samples and developed a workflow for comprehensive, accurate, and quantitative analysis of lysine hydroxylation. Methodological improvements included optimisation of database search strategies to increase peptide coverage in lysine-rich regions and incorporation of immonium ion signatures to substantially improve confidence in hydroxylysine identification. We further demonstrated that stoichiometry derived from peptide precursor ion intensity faithfully captures hypoxia-responsive changes in lysine hydroxylation at amino acid resolution. Application of this workflow to bromodomain (BRD) proteins - epigenetic readers containing lysine-rich regions extensively hydroxylated by JMJD6 - revealed marked heterogeneity in the apparent kinetics of hydroxylation among target lysines, with evidence of interdependence between neighbouring sites. Hypoxia suppressed hydroxylation in a site-dependent manner, with greater suppression observed at sites displaying slower rates of hydroxylation. 
Together, the development and application of this workflow establish a methodological and biological framework for understanding how oxygen availability regulates protein function through lysine hydroxylation.
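As a rough illustration of precursor-intensity-based stoichiometry, the sketch below computes site occupancy as the hydroxylated intensity divided by the summed hydroxylated and unmodified intensities. This is a commonly used definition and an assumption here: the abstract does not give the formula, and the function name and inputs are hypothetical.

```python
def hydroxylation_stoichiometry(hydroxylated_intensity, unmodified_intensity):
    """Fraction of a site that is hydroxylated, from precursor ion intensities.

    Assumes both peptide forms ionise comparably, which is a simplification.
    """
    if hydroxylated_intensity < 0 or unmodified_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = hydroxylated_intensity + unmodified_intensity
    if total == 0:
        raise ValueError("at least one intensity must be positive")
    return hydroxylated_intensity / total


# Hypothetical intensities for one lysine under two oxygen conditions.
normoxia = hydroxylation_stoichiometry(75.0, 25.0)
hypoxia = hydroxylation_stoichiometry(25.0, 75.0)
```

Comparing such values per lysine between conditions is one way to express the site-dependent suppression under hypoxia that the abstract reports.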

9

Ground Truth-Based Evaluation of False Discovery Rate and Statistical Power in DIA Proteomics

Yarbro, J. M.; Huang, Y.; Pagala, V.; Fu, Y.; Wang, Z.; Wu, L.; Wang, X.; High, A. A.; Byrum, S.; Peng, J.; Yuan, Z.-F.

2026-06-02 bioinformatics 10.64898/2026.05.29.728747 medRxiv

Top 0.1%

39.3%

Data-independent acquisition (DIA) mass spectrometry enables rapid proteomic quantification, yet the reliability of statistical inference in DIA-based protein quantification remains incompletely understood. Here, we systematically evaluated missingness, false discovery rate (FDR), and statistical power, defined as true positive rate (i.e. sensitivity or recall), using technical replicates and a spike-in benchmark with known ground truth. Analysis of 18 HeLa replicates revealed persistent, abundance-dependent missingness. In the spike-in experiment with five replicates, human peptides were titrated against a stable yeast background, allowing fold changes (FCs) to be compared with expected values. Across comparisons with log2FCs ranging from 0.2 to 2.5, the nominal BH-FDR substantially underestimated the true FDR. For example, at a BH-FDR threshold of 0.05, the true FDR was ~0.2. Statistical power was ~40% for a log2FC of 0.2 and increased to nearly 100% for a log2FC of 2.5. Additional incorporation of FC thresholds improved the true FDR for large-FC comparisons, with slight loss of power, but markedly reduced sensitivity for small-FC comparisons. Together, these results indicate that nominal FDR does not necessarily reflect actual error rates in DIA proteomics and that DIA performance is influenced by protein abundance and expected fold changes. This study provides a framework for experimental design and data interpretation in DIA-based proteomic studies.
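The distinction between nominal BH-FDR and true FDR in this abstract can be made concrete with a small sketch. It applies the standard Benjamini-Hochberg adjustment and then, given ground-truth labels (in a spike-in design, titrated proteins are truly changed and background proteins are not), computes the realised FDR and power. The function names and toy p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def true_fdr_and_power(q_values, is_truly_changed, threshold=0.05):
    """Compare calls at a nominal threshold against known ground truth."""
    called = [q <= threshold for q in q_values]
    tp = sum(1 for c, t in zip(called, is_truly_changed) if c and t)
    fp = sum(1 for c, t in zip(called, is_truly_changed) if c and not t)
    n_true = sum(1 for t in is_truly_changed if t)
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    power = tp / n_true if n_true else 0.0
    return fdr, power


# Toy example: four proteins, two truly changed (spike-in), two background.
p_vals = [0.01, 0.04, 0.03, 0.5]
truth = [True, False, True, False]
q_vals = benjamini_hochberg(p_vals)
fdr, power = true_fdr_and_power(q_vals, truth, threshold=0.06)
```

In this toy case three proteins pass the nominal 0.06 threshold and one of them is a background protein, so the realised FDR is one third even though the nominal level is far lower. The abstract reports the same kind of gap on real DIA data.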

10

Reference-Based Library Construction Improves Performance in low-input diaPASEF Workflows

Charkow, J.; Ghaznavi, M.; Seale, B.; Peng, J.; Gingras, A.-C.; Rost, H.

2026-05-04 bioinformatics 10.64898/2026.04.29.721088 medRxiv

Top 0.1%

38.8%

In low-input mass spectrometry-based proteomics, Data Independent Acquisition (DIA), including diaPASEF, is quickly becoming the method of choice for label-free quantification. Whether using empirical or in silico spectral libraries, performance is dependent on the library; however, the optimal library construction strategy for low-input proteomics remains an open question. To address this, we examine and develop library construction approaches that are compatible with both spectrum-centric and peptide-centric analysis workflows. These approaches leverage a closely related, high-quality sample to improve library quality. First, we validated our approach in bulk sample amounts, where we observed that the effect of gas-phase fractionation-based library construction is dependent on the software framework, with improvements more pronounced in OpenSWATH compared to DIA-NN. In OpenSWATH, our peptide-centric library reconstruction workflow consistently outperforms a transfer learning strategy, an emerging alternative approach. In DIA-NN, trends are dependent on library source, highlighting OpenSWATH's stronger dependence on the search space. In low-input applications, such as single-cell-equivalent injection amounts (100 pg) of HeLa cell digest on a timsTOF SCP, our library construction approach provided more pronounced improvements across both software tools compared to bulk samples. Using a peptide-centric reconstruction approach with the OpenSWATH analysis framework, we detected over 15,000 peptide precursors (2480 protein groups), a 90% improvement over the original library. Furthermore, using a spectrum-centric construction approach, peptide precursor identification rates improved over 6-fold (~1000 to ~6000). Our strategy provides a practical solution for generating high-quality libraries in low-input applications.

11

Sample preparation for mass spectrometry-based tissue (phospho)proteomics

Sander, S.; Bayramoglu, I.; Stumpe, M.; Restivo, G.; Levesque, M.; Dengjel, J.

2026-06-18 biochemistry 10.64898/2026.06.17.732915 medRxiv

Top 0.1%

38.7%

This protocol describes the workflow for the preparation of tissue samples for proteome and phosphoproteome analyses using mass spectrometry. The tissue samples are cryogenically pulverized and homogenized in a sucrose-based buffer to ensure proper tissue disruption. For depletion of lipid contaminants, proteins are purified using chloroform-methanol precipitation, followed by a resuspension in a urea-based buffer for enzymatic digestion. Peptides are desalted and enriched for phosphopeptides prior to LC-MS/MS analysis. The workflow was developed for skin biopsies but is compatible with a broad range of tissue types.

12

ProtPen combines sequence- and structure-based approaches to facilitate protein function predictions on a proteome-wide scale

Mathai, D.; Schulze, S.

2026-07-11 bioinformatics 10.64898/2026.07.11.737882 medRxiv

Top 0.1%

38.4%

Proteins of unknown function represent a significant gap in our understanding of biological processes, encompassing large portions of the proteomes of many organisms, especially prokaryotes. Addressing this gap is critical to understanding the biology and pathogenicity of such organisms. We introduce ProtPen, an open-source pipeline that facilitates protein function prediction by combining eggNOG-mapper for sequence-based annotation with Foldseek for rapid structural similarity searches using AlphaFold-predicted protein structures. Annotation results from both tools are merged and enriched with UniProt metadata to produce a comprehensive output suitable for downstream analysis. The pipeline requires only a FASTA input file with UniProt identifiers, and is designed to analyze datasets on the scale of whole proteomes. Benchmarking on a curated dataset of well-characterized Pseudomonas aeruginosa proteins demonstrated an annotation accuracy of >90%, and highlighted the complementarity of sequence- and structure-based methods. Further evaluation of ProtPen included its application to biologically relevant datasets, comprising proteins of unknown function that exhibited significant differential abundances in a proteomics dataset of P. aeruginosa, and uncharacterized glycoproteins from Haloferax volcanii. ProtPen is readily extensible to incorporate additional protein function prediction tools. In summary, this pipeline facilitates the systemwide annotation of proteins of unknown function from proteomic datasets and whole proteomes.

13

High-Speed Mass Spectrometers diminish the difference between Data-Dependent and Data-Independent Acquisition Proteomics

O'Sullivan, N.; Bayer, F. P.; Mogler, C.; Kuster, B.

2026-05-28 biochemistry 10.64898/2026.05.26.727836 medRxiv

Top 0.1%

34.4%

Data-dependent acquisition mass spectrometry (DDA-MS) and data-independent acquisition mass spectrometry (DIA-MS) have historically offered complementary strengths in bottom-up proteomics, with DDA providing high-selectivity spectra for post-translational modification (PTM) analysis and DIA enabling more systematic peptide sampling. Here, we asked if this is still the case for the Orbitrap Astral platform that offers high-speed DDA and (ultra-) narrow-window DIA (nDIA) capabilities across proteome and phosphoproteome applications. When DDA and DIA measurements were parameter-matched (to the extent possible), the differences in analytical performance diminished markedly. Across extensive replicate analyses, both methods continued to identify new peptides and proteins without reaching saturation, indicating that the molecular complexity of biological samples still overwhelms even the fastest liquid chromatography-MS (LC-MS) methods. Incomplete sampling also contributed to substantial peptide-level non-overlap between DDA and nDIA and data completeness was only modestly better for nDIA than DDA across many replicates. Quantitatively, DDA and nDIA showed broadly similar precision and accuracy, with nDIA offering slightly higher precision and DDA slightly better accuracy in controlled mixture experiments. MS1-based quantification outperformed MS2-based quantification, particularly for short gradients, supporting MS1 quantification as a robust and general strategy for high-throughput proteomics. In phosphoproteomic samples, DDA and nDIA identified similar numbers of phosphopeptides, but DDA retained a small edge for phosphorylation site localisation. Together, the results show that advances in acquisition speed and sensitivity are narrowing the historical gap between DDA and DIA, while also revealing that current LC-MS workflows remain far from providing comprehensive proteome coverage. 
Going forward, further gains in dynamic range, scan speed, sensitivity, and transparent software tools will be required to reach systematic, comprehensive and reliable measurements of complex proteomes in a single shot.

14

Kinetic Lipidomics: Quantifying in vivo changes in lipid metabolism using metabolic labeling

Nielsen, C.; Denton, R.; Driggs, B.; Gates, S.; Hilton, T.; Naylor, B.; Quilling, C.; Virgin, K.; Cutler, K.; Sorensen, M.; Poulson, M.; Snedaker, P.; Hernandez, Z.; Transtrum, M.; Price, J. C.

2026-07-01 biochemistry 10.64898/2026.06.29.735310 medRxiv

Top 0.1%

34.3%

Lipid metabolism reflects the dynamic balance between metabolic turnover and concentration. Kinetic mass spectrometry (MS) enables direct quantification of molecular turnover in vivo. Previous work has shown that MS-based kinetic proteomics has provided powerful insights into proteome regulation. Analogous lipidome-wide kinetic measurements remain limited by challenges in defining molecule-specific labeling behavior. Here, we extend kinetic MS to untargeted lipidomics. Isotope labeling with deuterated water (2H2O) is commonly used for monitoring turnover of palmitate and other select lipids by measuring labeling of stable CH positions with deuterium (2H). Here, we extend the deuterium-incorporation model underlying these targeted lipid turnover assays to support untargeted analysis of all detectable lipids. This allows us to empirically quantify the effective fraction of endogenous synthesis (Asyn) and the turnover rate (k) across hundreds of lipid species simultaneously. One central barrier to lipidome-wide kinetic modeling is determining the endogenous number of deuterium-labeling sites for each molecule (nL), which is required to estimate Asyn and k accurately. The nL value is an essential component of biological kinetic assays. In kinetic proteomics, curated amino acid nL libraries enable peptide-level modeling by summing sequence-specific labeling-site values, but comparable resources are lacking for lipids and may not generalize across metabolic states or non-mammalian systems. Yet, gaps remain for lipids and for amino acids in modified metabolic conditions or non-mammalian biologies. Here, we empirically determine lipid nL values and validate the process with peptides against an nL library. To evaluate this strategy in a biologically relevant setting, we applied it to brain tissue from transgenic mice expressing human ApoE isoforms, where altered lipid transport and metabolism are implicated in Alzheimer's disease risk.
These data validate the method in a clinically relevant context and suggest that genotype-dependent metabolism can alter empirically determined lipid nL values.
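To make the roles of Asyn and k concrete, the sketch below uses a single-pool, first-order labeling model in which the fraction of newly synthesized lipid rises toward a plateau of Asyn at rate k. This model form is an assumption for illustration, since the abstract does not state the equation, and the function names are hypothetical. Converting measured deuterium enrichment into a fraction new requires the labeling-site count nL and is not shown here.

```python
import math


def fraction_new(t_days, k, a_syn):
    """Fraction of newly synthesized molecules after t_days of labeling.

    Assumed model: f(t) = a_syn * (1 - exp(-k * t)).
    """
    if t_days < 0 or k < 0:
        raise ValueError("t_days and k must be non-negative")
    if not 0.0 <= a_syn <= 1.0:
        raise ValueError("a_syn must lie in [0, 1]")
    return a_syn * (1.0 - math.exp(-k * t_days))


def rate_from_single_timepoint(f_new, t_days, a_syn):
    """Invert the assumed model to recover k from one measurement."""
    if t_days <= 0:
        raise ValueError("t_days must be positive")
    if not 0.0 <= f_new < a_syn:
        raise ValueError("f_new must lie in [0, a_syn)")
    return -math.log(1.0 - f_new / a_syn) / t_days


# Hypothetical lipid: 60% endogenously synthesized, k = 0.2 per day.
f_day5 = fraction_new(5.0, 0.2, 0.6)
k_recovered = rate_from_single_timepoint(f_day5, 5.0, 0.6)
```

In practice Asyn and k would be fitted jointly across several time points rather than inverted from one; the single-point inversion is only meant to show how the two parameters enter the curve.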

15

Confident Identification and Quantification of Mouse Brain Tissues Reveals Sirtuin 5-Dependent Regulation

Landgrave-Gomez, J.; Bons, J.; Vega-Hormazabal, G.; Riley, R.; Schilling, B.; Verdin, E.

2026-05-28 cell biology 10.64898/2026.05.26.726073 medRxiv

Top 0.1%

33.9%

Methylmalonylation is a non-enzymatic lysine post-translational modification derived from methylmalonyl-CoA, a reactive intermediate that accumulates during mitochondrial dysfunction and branched-chain amino acid catabolism. Although reported in models of methylmalonic acidemia, its broader distribution and functional relevance remain largely unexplored. Progress has been hindered by a key analytical challenge: methylmalonyl- and succinyl-lysine are isobaric (+100.0160 Da) and generate overlapping mass spectrometric fragmentation spectra, preventing confident identification in conventional proteomic workflows. Here, we establish a straightforward proteomic workflow that overcomes this barrier and enables confident identification and quantification of lysine methylmalonylation by combining antibody-based enrichment with data-independent acquisition mass spectrometry (DIA-MS). Anti-malonyl antibodies were used to enrich methylmalonylated peptides through cross-reactivity. Using synthetic peptide standards containing malonyl-, succinyl-, or methylmalonyl-lysine, we defined distinguishing analytical features including chromatographic retention time, ion mobility, and fragmentation patterns. Applying this approach to mouse brain tissues from Sirtuin-5 (SIRT5) knockout and wild-type mice, we identified 44 methylmalonylated peptides across 41 proteins, enriched in neuronal and myelin-associated proteins (NEFM, NEFL, MBP) and mitochondrial enzymes such as ADT1. Several sites were increased in SIRT5-deficient brains, consistent with regulation by this mitochondrial deacylase. Functional assays demonstrated that methylmalonylation of myelin basic protein (MBP) impairs lipid binding, linking this modification to myelin stability. Together, this workflow enables confident methylmalonylation identification and defines it as a widespread and regulated modification in the brain, providing a framework to study metabolically driven protein acylation in neurobiology and disease.
SignificanceLysine methylmalonylation has remained largely unexplored due to its isobaric overlap with succinylation, which prevents confident identification using conventional proteomic workflows. Here, we establish an integrated strategy combining antibody-based enrichment, data-independent acquisition mass spectrometry, and orthogonal analytical features to resolve these modifications with high confidence. Applying this approach to mouse brain tissue reveals a SIRT5-regulated methylmalonylome enriched in mitochondrial and myelin-associated proteins, including myelin basic protein (MBP). Functional assays demonstrate that methylmalonylation impairs MBP lipid binding, linking this modification to myelin stability. Beyond this specific application, our workflow provides a generalizable framework to resolve isobaric post-translational modifications and expands the study of metabolically driven protein acylation in neurobiology and disease.
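
The isobaric overlap described in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the preprint: it assumes the standard net elemental additions for each acylation (C3H2O3 for malonyl, C4H4O3 for both succinyl and methylmalonyl) and standard monoisotopic element masses. The names `MONOISOTOPIC`, `ACYL_SHIFTS`, and `mass_shift` are hypothetical.

```python
# Monoisotopic masses (Da) of the elements involved.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503223, "O": 15.99491461957}

# Assumed net atoms added to the lysine side chain by each acylation.
ACYL_SHIFTS = {
    "malonyl": {"C": 3, "H": 2, "O": 3},
    "succinyl": {"C": 4, "H": 4, "O": 3},
    "methylmalonyl": {"C": 4, "H": 4, "O": 3},
}


def mass_shift(composition):
    """Sum monoisotopic masses over an elemental composition dict."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


shifts = {name: mass_shift(comp) for name, comp in ACYL_SHIFTS.items()}
for name, delta in shifts.items():
    print(f"{name:>14}: +{delta:.4f} Da")
```

Because succinyl and methylmalonyl share one elemental composition, no mass analyser can separate them by precursor mass alone, which is why the abstract turns to retention time, ion mobility, and fragmentation patterns.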

16

Enhanced proteome relative quantification using refined quantotypic spectral libraries

Barnes, B. A.; Alharbi, H.; Unwin, R.

2026-07-10 bioinformatics 10.64898/2026.07.06.736793 medRxiv

Top 0.1%

33.3%

Plasma proteomics is used for a variety of applications including biomarker discovery, disease monitoring, and drug development. Data-independent acquisition (DIA) has vastly improved the breadth of proteins that are identified from samples; however, given challenges in reproducibility and translation, it is critical that the quantitative performance of these methods is reliable. Analysis of global proteomics data typically incorporates information from all detected peptides. However, some peptides do not reflect their parent protein amount, due to irreproducible digestion, modification, analytical interferences or instability. We hypothesise that including these peptides impacts protein relative quantification, and thus, a refined spectral library containing only quantitatively representative peptides provides superior protein quantification. By analysing a defined multi-species spike-in model, we show that refining a plasma spectral library by removing precursors that fail to meet quality control metrics (25.4% of all identified precursors) reduces noise and variability, improving precision, accuracy and differential abundance analysis by up to ~11%, with minimal identification losses and substantial reduction in computational demand. This demonstrates proof-of-concept that refining spectral libraries produces results that prioritise quantification quality over quantity. This approach could enable development of universal tissue-specific refined spectral libraries able to improve quantification quality with easy implementation and minimal processing time.
Significance of the Study: As DIA mass spectrometry proteome depth increases, the quality of the associated protein quantifications must be considered alongside identification breadth, particularly in complex matrices such as plasma, which presents additional technical challenges. The spectral library used for protein identification and quantification is a critical determinant of DIA performance, and its composition requires careful consideration. This work illustrates an initial step toward improving protein quantification starting at the spectral library level by filtering precursors that are poor quantitative representatives of their parent proteins. In doing so, the resulting data is more reliable for downstream analysis and biological interpretation, with fewer false differential abundance assignments and reduced quantitative noise. As such, this work represents a broader shift away from the habitual focus of MS workflows on maximising the number of protein and differential abundance identifications and instead prioritises the quality of quantification over quantity. These initial findings lay the groundwork for further development of spectral library refinement strategies, with the potential to continue improving the accuracy and precision of protein quantification in DIA-based proteomics.

17

LAMPrEY: a Python-based automated quality control tool for large-scale proteomics datasets

Valdes-Tresanco, M. E.; Wacker, S.; Valdes-Tresanco, M. S.; Plakhotnyk, A.; Brodie, N. I.; Hepburn, M.; Ulke-Lemee, A.; Huttlin, E. L.; Lewis, I. A.

2026-05-11 bioinformatics 10.64898/2026.05.06.722826 medRxiv

Top 0.1%

31.7%

In recent years, proteomics has moved increasingly towards the analysis of large cohorts of biological specimens. This has been made possible by significant improvements in mass spectrometry technology, chromatographic separation methods, and data acquisition strategies. These technological advances now routinely enable experiments that yield vast datasets that substantially outstrip the capacity of existing proteomics data analysis approaches. Processing such large datasets requires purpose-built quality-control tools designed to organize and analyze the data while recording all processing parameters for reproducibility. To address this need, we developed an open-source, Python-based software platform, Large-scale Automated Multi-level Proteomics Evaluation by Python (LAMPrEY), a comprehensive quality-control pipeline for quantitative proteomics analyses of large cohorts of samples. LAMPrEY features GUI-based file submission, automated processing with MaxQuant and RawTools, an interactive analytics dashboard, and an application programming interface (API) for programmatic usage that collectively enable rapid, reproducible analysis and interpretation of proteomics data. We demonstrate the longitudinal monitoring and analytical capabilities of LAMPrEY using TMT11 quantitative proteomics data generated from 910 Enterococcus faecium isolates collected from bloodstream infection patients. LAMPrEY is open-source software that can be accessed at www.lewisresearchgroup.org/software.

18

Systematic Characterization of Thermal Stability Assay Parameters and Application in Discovery of Peptide-Protein Interactions

Richards, D. M.; Zhai, F.; Li, S.; Yu, Q.

2026-05-08 biochemistry 10.64898/2026.05.06.723354 medRxiv

Top 0.1%

31.6%

Thermal proteome profiling (TPP) and its higher-throughput derivative, the proteome integral solubility alteration (PISA) assay, measure changes in protein thermal stability upon ligand binding or other perturbations and have been widely adopted in drug discovery and biomedical research. Though the PISA workflow is straightforward, key parameters, including detergent concentration, methods for removing denatured aggregates, and temperature range selection, vary across studies and can markedly influence assay outcomes. Yet these factors have not been systematically evaluated, limiting rational experimental design and data interpretation. Here, through a combined use of TPP, PISA, tandem mass tag (TMT)-based multiplexing, and computational simulation, we systematically characterize these parameters based on the melting behavior of ~9,000 proteins. We find that reducing detergent concentration elevates apparent Tm by 1.5-2 °C proteome-wide, and aggregate removal by filtration versus centrifugation further alters measurements. We leverage these observations to optimize PISA, then apply the optimized conditions to identify the aminopeptidase NPEPPS as a previously uncharacterized binding partner of angiotensin II, a key vasoactive peptide hormone in blood pressure regulation. Together, this work provides a general framework for assay design and data interpretation, and extends the utility of PISA beyond small molecules to dissecting peptide-protein interactions, an increasingly important modality in drug discovery.
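
To illustrate why a shift in apparent Tm changes a PISA readout, the following minimal sketch models a melting curve as a two-state sigmoid and sums the soluble fraction over a pooled temperature gradient. It is not the authors' simulation: the sigmoid form, the slope, the 43-61 °C gradient, and the Tm values are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def soluble_fraction(temp_c, tm_c, slope=1.5):
    """Two-state sigmoid: fraction of protein still soluble at temp_c (assumed model)."""
    return 1.0 / (1.0 + math.exp((temp_c - tm_c) / slope))


def pisa_signal(tm_c, temps_c):
    """PISA-like readout: soluble fractions summed over the pooled temperature points."""
    return sum(soluble_fraction(t, tm_c) for t in temps_c)


temps = [43 + 2 * i for i in range(10)]   # hypothetical gradient, 43 to 61 °C
baseline = pisa_signal(50.0, temps)       # hypothetical protein with Tm = 50 °C
shifted = pisa_signal(52.0, temps)        # same protein after a +2 °C stabilisation
fold_change = shifted / baseline
print(f"baseline={baseline:.3f} shifted={shifted:.3f} fold change={fold_change:.3f}")
```

In this toy model the pooled signal rises when Tm rises, and the size of the change depends on where the gradient sits relative to Tm, which is one way to see why temperature range selection matters.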

19

Near-Zero Missed Cleavages with a High-Fidelity Recombinant Arg-C Zero for Mass Spectrometry-Based Proteomics

Hernandez-Rollan, C.; Elsborg, J. D.; Le Boiteux, E.; Lu, Y.; Patel, K.; Ahel, I.; Jensen, O. N.; Batth, T. S.; Olsen, J. V.

2026-05-28 biochemistry 10.64898/2026.05.28.728370 medRxiv

Top 0.1%

30.9%

Proteolytic digestion remains a critical step in bottom-up proteomics workflows, with enzyme specificity and efficiency directly impacting peptide identification and protein sequence coverage. Here, we present the comprehensive characterization of Arg-C Zero, a recombinant arginyl endopeptidase derived from Porphyromonas gingivalis that exhibits exceptional fidelity in cleaving specifically at the C-terminus of arginine residues. Unlike conventional serine proteases such as Trypsin, Arg-C Zero utilizes a histidine-cysteine catalytic dyad mechanism, achieving near-zero missed cleavage rates (>99% efficiency) under standard proteomics conditions. Through systematic evaluation using HeLa protein extracts, we demonstrate that Arg-C Zero maintains consistent performance across varying digestion times. The enzyme shows robust activity across a broad pH range and tolerates up to 4 M urea, making it ideally suited to a diverse range of proteomics sample preparation workflows. While Trypsin/LysC combinations remain superior for comprehensive proteome coverage, Arg-C Zero offers unique advantages for applications requiring high specificity and reproducible arginine-specific cleavage patterns, particularly for analysis of post-translational modifications (PTMs). Here, we demonstrate how Arg-C Zero aids comprehensive mapping of histone PTMs and, when used in low-pH workflows, helps preserve labile ADP-ribosylation sites, expanding the analytical capabilities of mass spectrometry for characterizing these challenging modifications. The enzyme's resistance to proline-adjacent cleavage sites and compatibility with standard mass spectrometry buffers position it as a valuable addition to the proteomics enzyme toolkit.
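
As a rough illustration of what a missed-cleavage rate measures for an arginine-specific protease, the sketch below performs an idealised in silico digest (cleavage after every arginine, with no exceptions) and counts internal arginines in a peptide list. The sequences are made up, the function names are hypothetical, and the preprint may define its efficiency metric differently.

```python
def argc_digest(sequence):
    """Cleave after every arginine (idealised Arg-C specificity, no exceptions)."""
    peptides, start = [], 0
    for index, residue in enumerate(sequence):
        if residue == "R":
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without a trailing R
    return peptides


def missed_cleavages(peptide):
    """Count arginines that are not the C-terminal residue of the peptide."""
    return peptide[:-1].count("R")


def missed_cleavage_rate(peptides):
    """Fraction of peptides containing at least one internal arginine."""
    if not peptides:
        return 0.0
    return sum(1 for p in peptides if missed_cleavages(p) > 0) / len(peptides)


ideal = argc_digest("MKRAAGRPLK")                  # toy sequence
observed = ["MKR", "AAGRPLK", "PLK", "AAGR"]       # toy identifications
print(ideal, missed_cleavage_rate(observed))
```

A fully cleaved digest has a rate of zero under this definition, so a near-zero missed-cleavage claim corresponds to almost no identified peptides carrying an internal arginine.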

20

The Single Cell Proteomic blueprint, navigating instrumentation platforms, software tools and high-load libraries in neutrophils, RKO and A549 cells

Brenes, A. J.; Mayer, R. L.; Makar, A.; Coelho, P.; van Stralen, G.; Sadiku, P.; Walmsley, S. R.; Matzinger, M.; Mechtler, K.; von Kriegsheim, A.

2026-06-16 biochemistry 10.64898/2026.06.12.731618 medRxiv

Top 0.1%

30.7%

Mass spectrometry-based single cell proteomics (SCP) is rapidly emerging as a powerful approach for biological research, with applications extending beyond in vitro cancer cell lines. Recent advances make it possible to apply SCP to ex vivo human cells from tissues such as the brain and pancreas, as well as to technically challenging immune populations such as neutrophils. However, these analyses remain more challenging and typically result in reduced proteomic coverage. To support the development of robust workflows for SCP data acquisition and analysis, we systematically evaluated multiple DIA search engines, search engine settings, the inclusion of high-load library samples in single-cell search spaces, the impact of contaminants, and the quantitative properties of identified proteins. These comparisons were performed across two major instrumentation platforms, Orbitrap Astral and timsTOF SCP, and across A549 cells, RKO cells, and neutrophils, three cell types differing in size and protein content. Our work provides guidelines on the software parameters to use for SCP, instrument-specific results, and cell-dependent optimizations of high-load libraries, as well as a novel evaluation of the quantitative properties of proteins for single-cell and low-input proteomics.